Towards A Unified Policy Abstraction Theory and Representation Learning Approach in Markov Decision Processes
Lying at the heart of intelligent decision-making systems, how policy is
represented and optimized is a fundamental problem. The root challenge in this
problem is the large scale and the high complexity of policy space, which
exacerbates the difficulty of policy learning especially in real-world
scenarios. Towards a desirable surrogate policy space, recently policy
representation in a low-dimensional latent space has shown its potential in
improving both the evaluation and optimization of policy. The key question
involved in these studies is by what criterion we should abstract the policy
space for desired compression and generalization. However, both the theory on
policy abstraction and the methodology on policy representation learning are
less studied in the literature. In this work, we make a first effort to
fill this gap. First, we propose a unified policy abstraction theory,
containing three types of policy abstraction associated with policy features at
different levels. Then, we generalize them to three policy metrics that
quantify the distance (i.e., similarity) of policies, for more convenient use
in learning policy representation. Further, we propose a policy representation
learning approach based on deep metric learning. For the empirical study, we
investigate the efficacy of the proposed policy metrics and representations, in
characterizing policy difference and conveying policy generalization
respectively. Our experiments are conducted on both policy optimization and
evaluation problems, including trust-region policy optimization (TRPO),
diversity-guided evolution strategy (DGES) and off-policy evaluation (OPE).
Somewhat naturally, the experimental results indicate that there is no
universally optimal abstraction for all downstream learning problems, while the
influence-irrelevance policy abstraction can be a generally preferred choice.
Comment: Preprint versio
Dynamics in direct two-photon transition by frequency combs
Two-photon resonance transition technology has been proven to have a wide
range of applications, but it is limited by the available wavelengths of
commercial lasers. Applying optical frequency comb technology to the direct
two-photon transition (DTPT) removes the restriction imposed by cw lasers. This
article theoretically analyzes the dynamical effects of the DTPT process driven
by optical frequency combs. In a three-level atomic system, the population of
particles and the amount of momentum transfer on atoms are increased compared
with the DTPT-free process. The 17% population increase in a 6-level
system of cesium atoms verifies that the DTPT process robustly enhances
momentum transfer. The same mode-locked laser can be used to excite the DTPTs
of rubidium and cesium simultaneously. This
technology has potential applications in cooling different atoms to obtain
polar cold molecules, as well as in high-precision spectroscopy measurement.
Comment: 7 pages, 7 figure